Bioproxy

This project aims to establish a proxy for determining the activity of transcription factors based on the expression levels of genes regulated by those factors.

The null hypothesis posits that the abundance of transcripts serves as a reliable proxy for activity. However, this assumption may not hold for all transcription factors. Some factors are active only when phosphorylated, meaning that their transcripts may be constitutively expressed while activity depends on phosphorylation, correlating instead with the activity of the corresponding kinase.

We will also focus on developing a general model that relates transcription factor activity to transcript abundance. Significant deviations from this model could indicate transcription factors whose activity is not accurately reflected by transcript levels.

Table of Contents

These are the steps we performed in our analysis:

  1. Library loading

  2. Preprocessing

  3. Exploratory analysis

  4. Models …

Library loading

library(readr)
library(matrixStats)
library(dplyr)
library(ggplot2)
library(grid)
library(gridExtra)
library(caret)
library(glmnet)
library(igraph)
library(sigmoid)
library(vip)
library(ggrepel)
set.seed(123)

Preprocessing

Let’s start by importing the dataset from PreciseDB, which contains the expression levels of 3,923 genes in E. coli across 278 different conditions.

# Import file 
log_tpm <- read.csv("log_tpm_full.csv", row.names = 1)
log_tpm

Next, we perform some preprocessing. First, we exclude:

  • Conditions with knockout genes

  • Genes with isoforms

+++ WHY +++

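# Exclude conditions with knockout genes
# (grepl() returns a logical mask, so no column is dropped by accident when nothing matches)
log_tpm <- log_tpm[, !grepl("_del", colnames(log_tpm))]

# Drop genes with isoforms (row names ending in "_<digits>")
genes_with_isoforms <- grep("_\\d+$", rownames(log_tpm), value = TRUE)

log_tpm <- subset(log_tpm, !(rownames(log_tpm) %in% genes_with_isoforms))
dim(log_tpm)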
dim(log_tpm)
[1] 4257  188
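
The two filters can be tried on a small toy table first. This is only a sketch: the condition names, gene identifiers, and values are made up for illustration. A logical filter with grepl() is used because negative indexing with an empty grep() result would drop every column.

```r
# Toy expression table: rows are genes, columns are conditions (hypothetical names)
toy <- data.frame(
  wt_glc      = c(1.2, 3.4, 0.5, 2.2),
  wt_ace      = c(1.0, 3.1, 0.7, 2.0),
  crp_del_glc = c(0.9, 0.2, 0.4, 2.1),
  row.names   = c("b0001", "b0002_1", "b0002_2", "b0003")
)

# Keep only conditions whose name does not contain "_del"
toy <- toy[, !grepl("_del", colnames(toy)), drop = FALSE]

# Keep only genes whose identifier does not end in "_<digits>"
toy <- toy[!grepl("_\\d+$", rownames(toy)), , drop = FALSE]

dim(toy)
```

Only the two wild-type conditions and the two genes without an isoform suffix remain.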

Now we can proceed to import the regulatory network from RegulonDB that reports the target genes for each regulator.

regulator <- read.table(file="tableDataReg.csv",
                        header=TRUE, 
                        sep=",")
regulator

We decided to remove:

  • Rows that do not refer to a protein regulator

  • Relationships reported as Weak or Unknown

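# Remove ppGpp, which is not a protein regulator
regulator <- regulator[which(regulator[, 2] != "ppGpp"), ]

# Remove relationships reported as weak ("W")
w <- which(trimws(regulator[,7])=="W")
if(length(w)>0){
  regulator <- regulator[-w,]
}

# Remove relationships reported as unknown ("?")
w <- which(trimws(regulator[,7])=="?")
if(length(w)>0){
  regulator <- regulator[-w,]
}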

nrow(regulator)
[1] 4229
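
The same filtering logic can be written as a single logical mask. The sketch below uses a toy table whose column names (`regulator`, `target`, `evidence`) and rows are hypothetical, and the "S" code is an assumed placeholder for any evidence level that is kept.

```r
# Toy regulatory table (hypothetical column names and rows)
toy_reg <- data.frame(
  regulator = c("crp", "ppGpp", "fnr", "arcA"),
  target    = c("lacZ", "rrsA", "narG", "sdhC"),
  evidence  = c("S", "S", " W", "?"),
  stringsAsFactors = FALSE
)

# Drop the non-protein regulator and any weak ("W") or unknown ("?") evidence;
# trimws() guards against stray whitespace around the evidence code
keep <- toy_reg$regulator != "ppGpp" &
  !(trimws(toy_reg$evidence) %in% c("W", "?"))
toy_reg <- toy_reg[keep, ]

nrow(toy_reg)
```

A mask built with `%in%` avoids the `length(w) > 0` guards, because logical indexing keeps every row when nothing matches.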

The gene identifiers in the regulatory network from RegulonDB differ from those in the PreciseDB dataset: the regulatory network uses gene names, whereas our dataset uses b-numbers (bnums). We therefore convert the bnums to gene names to make comparison and downstream analysis easier. In addition, we decided to:

  • Remove unmapped genes

  • Remove duplicate genes

# Load the bnum-to-gene-name mapping from EcoCyc
map_bnum <- read.delim("mapbnum.txt", header = TRUE)
map_bnum <- map_bnum[c("Gene.Name", "Accession.1")]

# Map the bnums in our dataset to the gene names in the EcoCyc file
log_tpm$gene_number <- rownames(log_tpm)
log_tpm <- merge(log_tpm, map_bnum, by.x = "gene_number", by.y = "Accession.1", all.x = TRUE)

# Move Gene.Name to the first column
log_tpm <- log_tpm[, c("Gene.Name", setdiff(names(log_tpm), "Gene.Name"))]

# Remove genes whose bnum could not be mapped
log_tpm <- subset(log_tpm, !is.na(Gene.Name))

# Remove the duplicated gene insI2 (all of its expression values are also 0)
log_tpm <- subset(log_tpm, !(log_tpm$Gene.Name == "insI2"))

# Set gene names as row names and drop the first two columns (Gene.Name, gene_number)
rownames(log_tpm) <- log_tpm$Gene.Name
log_tpm <- log_tpm[,3:ncol(log_tpm)]

# Transpose so that rows are conditions and columns are genes
log_tpm <- t(log_tpm)
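
To make the mapping step concrete, here is a minimal sketch on toy data. The expression values and the unmapped identifier `b9999` are invented for illustration; the column names `Gene.Name` and `Accession.1` follow the mapping file used above.

```r
# Toy expression table keyed by b-number (hypothetical values)
expr <- data.frame(
  gene_number = c("b0001", "b0002", "b9999"),
  cond_a = c(5.1, 7.3, 0.0),
  cond_b = c(4.8, 7.9, 0.0)
)

# Toy mapping table from b-number to gene name
mapping <- data.frame(
  Gene.Name   = c("thrL", "thrA"),
  Accession.1 = c("b0001", "b0002")
)

# Left join: every expression row is kept, unmapped ones get NA as Gene.Name
merged <- merge(expr, mapping, by.x = "gene_number", by.y = "Accession.1", all.x = TRUE)

# Drop unmapped genes and check that the remaining names are unique
merged <- merged[!is.na(merged$Gene.Name), ]
stopifnot(!anyDuplicated(merged$Gene.Name))

# Use gene names as row names, keep only expression columns, then transpose
rownames(merged) <- merged$Gene.Name
expr_t <- t(merged[, c("cond_a", "cond_b")])
expr_t
```

Checking for duplicates with `anyDuplicated()` before assigning row names is a general alternative to removing a known duplicate by name, since `rownames<-` fails on a data frame with repeated names.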

Exploratory analysis

Now let’s analyze the distribution of the expression levels to understand which summary statistic is the best choice: the mean, median, maximum, or minimum of expression. We also want to check whether the distributions are Gaussian, using the Shapiro–Wilk test:

\[ W = \dfrac{\big(\sum_{i=1}^{n} a_i x_{(i)} \big)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \]

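Here \(x_{(i)}\) is the \(i\)-th smallest value in the sample, \(\bar{x}\) is the sample mean, and the \(a_i\) are the coefficients of the test. The hypotheses are:

\[ H_0: \text{a sample } x_1, \cdots, x_n \text{ is drawn from a normally distributed population.} \\ H_a: \text{a sample } x_1, \cdots, x_n \text{ is not drawn from a normally distributed population.} \]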
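
As a quick illustration of how the test behaves, the sketch below applies `shapiro.test()` to two simulated samples, one normal and one strongly skewed. The data are simulated and the sample size of 200 is an arbitrary choice.

```r
set.seed(123)

# Sample drawn from a normal distribution: H0 should usually not be rejected
normal_sample <- rnorm(200, mean = 5, sd = 1)

# Strongly right-skewed sample: H0 should be rejected
skewed_sample <- rexp(200, rate = 1)

res_normal <- shapiro.test(normal_sample)
res_skewed <- shapiro.test(skewed_sample)

# W close to 1 is consistent with normality; a small p-value is evidence against it
c(W_normal = unname(res_normal$statistic), p_normal = res_normal$p.value)
c(W_skewed = unname(res_skewed$statistic), p_skewed = res_skewed$p.value)
```

Note that `shapiro.test()` accepts between 3 and 5000 observations, which is enough for the per-gene summaries used here.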

# Histograms of Summary Statistics
log_tpm_mean <- data.frame(value = apply(log_tpm, 2, mean))
mean_hist <- ggplot(log_tpm_mean, aes(x = value)) +
  geom_histogram(binwidth = 0.5, fill = "skyblue", 
                 color = "black", bins = 100) +
  labs(x = "Mean log-TPM", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())
shapiro.test(log_tpm_mean$value) #not gaussian

    Shapiro-Wilk normality test

data:  log_tpm_mean$value
W = 0.98717, p-value < 2.2e-16

log_tpm_median <- data.frame(value = apply(log_tpm, 2, median))
median_hist <- ggplot(log_tpm_median, aes(x = value)) +
  geom_histogram(binwidth = 0.5, fill = "lightgreen", 
                 color = "black", bins = 100) +
  labs(x = "Median log-TPM", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())
shapiro.test(log_tpm_median$value) #not gaussian

    Shapiro-Wilk normality test

data:  log_tpm_median$value
W = 0.98578, p-value < 2.2e-16

log_tpm_max <- data.frame(value = apply(log_tpm, 2, max))
max_hist <- ggplot(log_tpm_max, aes(x = value)) +
  geom_histogram(binwidth = 0.5, fill = "lavender", 
                 color = "black", bins = 100) +
  labs(x = "Max log-TPM", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())
shapiro.test(log_tpm_max$value) #not gaussian

    Shapiro-Wilk normality test

data:  log_tpm_max$value
W = 0.99611, p-value = 3.889e-09

log_tpm_min <- data.frame(value = apply(log_tpm, 2, min))
min_hist <- ggplot(log_tpm_min, aes(x = value)) +
  geom_histogram(binwidth = 0.5, fill = "lightpink", 
                 color = "black", bins = 100) +
  labs(x = "Min log-TPM", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())
shapiro.test(log_tpm_min$value) #not gaussian

    Shapiro-Wilk normality test

data:  log_tpm_min$value
W = 0.91615, p-value < 2.2e-16

grid.arrange(mean_hist, median_hist, max_hist, min_hist, nrow = 2, ncol = 2,
             top = textGrob("Histograms of Summary Statistics", 
                            gp=gpar(fontsize=16)))

Based on the Shapiro-Wilk tests on the four distributions and on their histograms, we conclude that none of the distributions is Gaussian (normal).
Consequently, whichever summary we choose (mean, median, maximum, or minimum), it may not satisfy the normality assumption that linear models commonly rely on.

Now we plot some histograms to get a better idea of how many target genes each regulator regulates positively and negatively.

# Histograms of how many genes are regulated by each regulator
positive_reg <- regulator[regulator$X6.function == "+",]
negative_reg <- regulator[regulator$X6.function == "-",]
unique_regulators <- unique(regulator$X3.RegulatorGeneName)

pos_counts <- c()
neg_counts <- c()

for(reg in unique_regulators){
  pos_counts <- c(pos_counts, 
                  count(positive_reg[positive_reg$X3.RegulatorGeneName == reg,]))
  neg_counts <- c(neg_counts, 
                  count(negative_reg[negative_reg$X3.RegulatorGeneName == reg,]))
}

pos_counts <- unlist(pos_counts)
names(pos_counts) <- unique_regulators

neg_counts <- unlist(neg_counts)
names(neg_counts) <- unique_regulators

pos_counts <- data.frame(value = pos_counts)
shapiro.test(pos_counts$value) #not gaussian

neg_counts <- data.frame(value = neg_counts)
shapiro.test(neg_counts$value) #not gaussian

total_counts <- data.frame(value = pos_counts$value + neg_counts$value)
shapiro.test(total_counts$value) #not gaussian

pos_counts_hist <- ggplot(pos_counts, aes(x = value)) +
  geom_histogram(binwidth = 10, fill = "skyblue", 
                 color = "black", bins = 20) +
  labs(x = "Positive Regulations Count", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())

neg_counts_hist <- ggplot(neg_counts, aes(x = value)) +
  geom_histogram(binwidth = 10, fill = "lightgreen", 
                 color = "black", bins = 20) +
  labs(x = "Negative Regulations Count", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())

total_counts_hist <- ggplot(total_counts, aes(x = value)) +
  geom_histogram(binwidth = 10, fill = "lightpink", 
                 color = "black", bins = 20) +
  labs(x = "Total Regulations Count", y = "Frequency") +
  theme_minimal() +
  theme(panel.grid.major = element_blank(), 
        panel.grid.minor = element_blank())

grid.arrange(pos_counts_hist, neg_counts_hist, total_counts_hist, nrow = 1)

There are slightly more negative regulations than positive ones. We can also see that most regulators have fewer than 100 targets. There is one exception: crp has 300 targets.
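
The per-regulator counts can also be obtained without an explicit loop by cross-tabulating regulator against the sign of the interaction. This is a sketch on toy data with hypothetical column names (`regulator`, `effect`), not the code used to produce the histograms above.

```r
# Toy regulatory interactions (hypothetical rows)
toy_reg <- data.frame(
  regulator = c("crp", "crp", "crp", "fnr", "fnr", "arcA"),
  effect    = c("+", "+", "-", "-", "+", "-"),
  stringsAsFactors = FALSE
)

# Cross-tabulate regulators against the sign of the interaction;
# fixing the factor levels keeps a "+" and a "-" column even if one sign is absent
counts <- table(toy_reg$regulator, factor(toy_reg$effect, levels = c("+", "-")))
counts

# Total number of regulations per regulator, largest first
sort(rowSums(counts), decreasing = TRUE)
```

Sorting the totals makes hub regulators such as crp easy to spot.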

Network analysis

# Build the edge list (regulator -> target) used to plot the network
edge_list <- cbind(RegulatorName = regulator$X3.RegulatorGeneName, 
                   Target = regulator$X5.regulatedName)
graph <- graph_from_edgelist(edge_list)
layout <- layout_with_fr(graph, niter=100)


# Visualization of the network (Fruchterman-Reingold layout)
plot.igraph(graph, layout=layout, vertex.label=NA, vertex.size=3, edge.arrow.size=0.2, edge.curved=TRUE, main="E.coli Network", xlim=c(-0.5, 0.5), ylim=c(-1, 1))
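
Beyond plotting, one simple way to summarize such a network is through node degrees. The sketch below is not part of the analysis above: it builds a small directed graph from a hypothetical edge list and computes in- and out-degrees with igraph.

```r
library(igraph)

# Toy directed edge list: regulator -> target (hypothetical edges)
edges <- cbind(
  Regulator = c("crp", "crp", "crp", "fnr", "fnr"),
  Target    = c("lacZ", "araB", "fnr", "narG", "araB")
)
g <- graph_from_edgelist(edges, directed = TRUE)

# Out-degree: number of targets of a regulator
# In-degree: number of regulators acting on a gene
out_deg <- degree(g, mode = "out")
in_deg  <- degree(g, mode = "in")

sort(out_deg, decreasing = TRUE)
```

On the full network, the out-degree of each regulator corresponds to its number of regulatory interactions.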
